Teaching “Unstructured Information Management: Theory and Applications” to Computational Linguistics Students

Authors

  • Iryna Gurevych
  • Christof Müller
  • Torsten Zesch
Abstract

Students in Computational Linguistics often lack experience in building robust and scalable software components. As a result, student projects tend to be unstable and to work only under very specific preconditions (e.g., a project has to be installed in a certain directory, or handles only single files instead of whole directories). Furthermore, if students have to build a system from scratch, they must spend much of their effort on input and output issues and on connecting numerous preprocessing components that were not designed to work together. This limits the scope of feasible course tasks to relatively simple ones, such as implementing yet another tokenizer. When offering the course “Unstructured Information Management: Theory and Applications” as part of the B.A./M.A. program of International Studies in Computational Linguistics at the University of Tübingen, our motivation was to familiarize students with fundamental concepts in unstructured information management and Natural Language Processing (NLP) middleware. This should enable students of computational linguistics to work on more challenging tasks and to gain initial experience with building complex software systems. The course goals were supported by providing basic preprocessing components, such as a tokenizer or a PoS tagger, on the basis of the Unstructured Information Management Architecture (UIMA) (Ferrucci and Lally, 2004). Thus, students of computational linguistics can concentrate on their core competence and work on more challenging tasks both in terms of theoretical complex-

Similar resources

Application of Learning Theories in Clinical Education

Introduction: The purpose of education is learning. Several theories have been raised about learning, which have tried to explain how learning occurs. They help teachers to choose teaching methods, prepare learning environment and determine students' activities. Given the importance of learning theories in education, this study aimed to review application of learning theories in nursing educati...

Strategies for Teaching “Mixed” Computational Linguistics Classes

Many of the computational linguistics classes at Ohio State draw a diverse crowd of students, who bring different levels of preparation to the classroom. In the same classroom, we often get graduate and undergraduate students from Linguistics, Computer Science, Electrical Engineering and other departments; teaching the same material to all of these students presents an interesting challenge to ...

Using GATE As An Environment For Teaching NLP

In this paper we argue that the GATE architecture and visual development environment can be used as an effective tool for teaching language engineering and computational linguistics. Since GATE comes with a customisable and extendable set of components, it allows students to get hands-on experience with building NLP applications. GATE also has tools for corpus annotation and performance evaluat...

Second Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics

In Fall 2004 I introduced a new course called Applied Natural Language Processing, in which students acquire an understanding of which text analysis techniques are currently feasible for practical applications. The class was intended for interdisciplinary students with a somewhat technical background. This paper describes the topics covered and the programming exercises, emphasizing which aspec...

Some reflections on the teaching of CAT

Synopsis: Information processing is both a tool for the professional translator and an area of interest to translators. This implies two types of teaching for data processing: knowledge of the field and know-how. The use of machine aids to translation has to respect two fundamental principles: the techniques are to be used in the service of content and the document has to be thought as electronic. ...

Publication date: 2007